Written
Treebank,
Language Type:
Multilingual
Languages:
Japanese
Availability:
The original corpus is required (non-free)
License:
<Not Specified>
Size:
78 MByte (the whole UD treebanks)
Production Status:
Newly created-in progress
Use:
Parsing and Tagging
- Paper title: Universal Dependencies for Japanese
- Paper track: Written
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Takaaki Tanaka | NTT CS lab | JP |
| Author 2 | Yusuke Miyao | National Institute of Informatics | JP |
| Author 3 | Masayuki Asahara | National Institute for Japanese Language and Linguistics | JP |
| Author 4 | Sumire Uematsu | National Institute of Informatics | JP |
| Author 5 | Hiroshi Kanayama | IBM Research - Tokyo | JP |
| Author 6 | Shinsuke Mori | Kyoto University | JP |
| Author 7 | Yuji Matsumoto | Nara Institute of Science and Technology | JP |
| Main Contact | Takaaki Tanaka | NTT CS lab | None |
Documentation:
<Not Specified>
Language Type:
Trilingual
Languages:
English Japanese Mandarin Chinese
Availability:
Planned
License:
<Not Specified>
Size:
<Not Specified>
Production Status:
Newly created-in progress
Use:
Semantic Similarity
- Paper title: Extending Monolingual Semantic Textual Similarity Task to Multiple Cross-lingual Settings
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country | Second Affiliation | Country |
|---|---|---|---|---|---|
| Author 1 | Yoshihiko Hayashi | Waseda University | JP | Osaka University | JP |
| Author 2 | Wentao Luo | Osaka University | JP | | |
| Main Contact | Yoshihiko Hayashi | Waseda University | None | | |
Documentation:
<Not Specified>
Speech/Written
Evaluation Data,
Language Type:
Multilingual
Languages:
Japanese
Availability:
From Data Center(s)
License:
<Not Specified>
Size:
35 <Not Specified>
Production Status:
Newly created-in progress
Use:
Information Extraction, Information Retrieval
- Paper title: Designing an Evaluation Framework for Spoken Term Detection and Spoken Document Retrieval at the NTCIR-9 SpokenDoc Task
- Paper track: Speech
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Tomoyosi Akiba | Toyohashi University of Technology | None |
| Author 2 | Hiromitsu Nishizaki | University of Yamanashi | None |
| Author 3 | Kiyoaki Aikawa | Tokyo University of Technology | None |
| Author 4 | Tatsuya Kawahara | Kyoto University | None |
| Author 5 | Tomoko Matsui | <Not Specified> | None |
| Main Contact | Tomoyosi Akiba | Toyohashi University of Technology | JP |
Documentation:
<Not Specified>
Written
Language Resources/Technologies Infrastructure,
Language Type:
Multilingual
Languages:
American English Japanese
Availability:
From Data Center(s)
License:
<Not Specified>
Size:
10 GByte
Production Status:
Existing-used
Use:
Question Answering
- Paper title: Overview of Todai Robot Project and Evaluation Framework of its NLP-based Problem Solving
- Paper track: Infrastructural Issues/Large Projects
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Akira Fujita | National Institute of Informatics | JP |
| Author 2 | Akihiro Kameda | National Institute of Informatics | JP |
| Author 3 | Ai Kawazoe | National Institute of Informatics | JP |
| Author 4 | Yusuke Miyao | National Institute of Informatics | JP |
| Main Contact | Akira Fujita | National Institute of Informatics | None |
Documentation:
<Not Specified>
Written
Corpus,
Language Type:
Multilingual
Languages:
English Japanese
Availability:
Freely Available
License:
Creative Commons
Size:
8000 entries
Production Status:
Existing-used
Use:
Bilingual Lexicon Extraction
- Paper title: Bilingual Segmented Topic Model
- Paper track: Empirical/Data-Driven
- Paper status: Accept - Poster - Monday
| Author Number | Name | Affiliation | Country | Second Affiliation | Country |
|---|---|---|---|---|---|
| Author 1 | Akihiro Tamura | National Institute of Information and Communications Technology | JP | | |
| Author 2 | Eiichiro Sumita | National Institute of Information and Communications Technology | JP | National Institute of Information and Communications Technology | N/A |
| Main Contact | Akihiro Tamura | National Institute of Information and Communications Technology | None | | |
Documentation:
<Not Specified>
Multimodal/Multimedia
Corpus,
Language Type:
Monolingual
Languages:
Japanese
Availability:
Not Available
License:
Size:
130 sessions of interactions (Other)
Production Status:
Newly created-in progress
Use:
Corpus Creation/Annotation
- Paper title: Semi-supervised learning for character expression of spoken dialogue systems
- Paper track: 11.1 Spoken dialog systems/Oral Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Kenta Yamamoto | ERATO Human-Robot Interaction Corpus | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Monolingual
Languages:
Japanese
Availability:
From Data Center(s)
License:
Size:
661 hours
Production Status:
Existing-used
Use:
Speech Synthesis
- Paper title: Investigating Effective Additional Contextual Factors in DNN-based Spontaneous Speech Synthesis
- Paper track: 7.10 Expression, emotion and personality generatio/Poster Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yuki Yamashita | Corpus of Spontaneous Japanese | /N |
Documentation:
None
Multimodal/Multimedia
Corpus,
Language Type:
Monolingual
Languages:
Japanese
Availability:
Public release in preparation
License:
Size:
3.0 GByte
Production Status:
Newly created-finished
Use:
Dialogue
- Paper title: Gaming Corpus for Studying Social Screams
- Paper track: 3.6 Social signal processing/Poster Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Hiroki Mori | Action Game Speech Communication Corpus | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Monolingual
Languages:
Chinese Dutch Finnish French German Greek Hungarian Japanese Russian Spanish
Availability:
Freely Available
License:
Apache-2.0
Size:
None
Production Status:
Existing-used
Use:
Speech Synthesis
- Paper title: One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech
- Paper track: 7.14 Cross-lingual and multilingual aspects in spe/Poster Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Tomáš Nekvinda | CSS10 | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
Dari/Pashto Dutch English Finnish French Hindi Icelandic Indonesian Japanese Lithuanian Malay Mandarin Nepali Portuguese Punjabi Romanian Slovenian Spanish
Availability:
From Owner
License:
Creative Commons
Size:
467 hours
Production Status:
Newly created-finished
Use:
Person Identification
- Paper title: JukeBox: A Multilingual Singer Recognition Dataset
- Paper track: 4.3 Speaker verification and identification/Oral Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Anurag Chowdhury | JukeBox | /N |
Documentation:
Documentation in English language will be made available upon publication of the dataset.




